Question 6
Intent of Question 


The primary goals of this question were to assess students’ ability to (1) interpret the slope of a regression 
line in context; (2) decide whether or not a model should be used for prediction; (3) describe the sampling 
distribution of a sample mean; (4) use the sampling distribution of a sample mean to obtain an interval of 
plausible values; (5) compare two different study plans and decide which one would provide a better 
estimator of the slope; (6) propose a different study plan to check an assumption. 


Solution 
Part (a): 


For each additional foot that is added to the width of the grass buffer strip, an additional 3.6 parts per 
hundred of nitrogen is removed on average from the runoff water. 


Part (b): 


No. This is extrapolation beyond the range of data from the experiment. Buffer strips narrower than 
5 feet or wider than 15 feet were not investigated. 


Part (c): 


Because the distribution of nitrogen removed for any particular buffer strip width is normally
distributed with a standard deviation of 5 parts per hundred, the sampling distribution of the mean of
four observations when the buffer strips are 6 feet wide will be normal with mean 33.8 + 3.6 × 6 = 55.4
parts per hundred and a standard deviation of σ/√n = 5/√4 = 2.5 parts per hundred.


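The arithmetic in part (c) can be checked with a short Python sketch. The variable names are illustrative only; the numeric values come from the problem statement, and the line y = 33.8 + 3.6x is assumed to be the true regression line, as the question instructs.

```python
import math

# Values from the problem statement (variable names are illustrative).
intercept, slope = 33.8, 3.6   # assumed true regression line
sigma = 5.0                    # SD of individual observations, parts per hundred
n = 4                          # number of strips at this width
width = 6                      # buffer strip width in feet

# Mean of the sampling distribution: the true mean response at this width.
mean_of_sample_mean = intercept + slope * width

# SD of the sampling distribution of a mean of n independent observations.
sd_of_sample_mean = sigma / math.sqrt(n)

print(round(mean_of_sample_mean, 1), sd_of_sample_mean)
```
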
Part (d): 
The distribution of the sample mean is normal, so the interval that has probability 0.95 of containing
the sample mean of the nitrogen removed by four buffer strips of width 6 feet extends from
55.4 − 1.96 × 2.5 = 50.5 parts per hundred to 55.4 + 1.96 × 2.5 = 60.3 parts per hundred.
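As a rough check of the interval in part (d), the sketch below uses Python's standard-library NormalDist to obtain the critical value instead of the rounded 1.96. The mean 55.4 and standard deviation 2.5 are the part (c) values; the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 55.4, 2.5   # sampling distribution of the sample mean, from part (c)

# Critical value that leaves 2.5% in each tail of a standard normal (about 1.96).
z = NormalDist().inv_cdf(0.975)

lower = mean - z * sd
upper = mean + z * sd

# Probability that a normal(mean, sd) sample mean falls inside the interval.
sampling_dist = NormalDist(mean, sd)
coverage = sampling_dist.cdf(upper) - sampling_dist.cdf(lower)

print(round(lower, 1), round(upper, 1), round(coverage, 2))
```
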
Part (e):


If we think that the sample mean nitrogen removed at a particular buffer width might reasonably be
any value in the intervals shown, a sample regression line will result from connecting any point in the 
interval above 6 to any point in the interval above 13. With this in mind, the dashed lines in the plots 
above represent extreme cases for possible sample regression lines. From these plots, we can see that 
there is a wider range of possible slopes in the second plot (on the right) than in the first plot (on the 
left). Because of this, the variability in the sampling distribution of b, the estimator for the slope of the
regression line, will be smaller for the first study plan (with four observations at 6 feet and four 
observations at 13 feet) than it would be for the second study plan (with four observations at 8 feet and 
four observations at 10 feet). Therefore, the first study plan (on the left) would provide a better 
estimator of the slope of the regression line than the second study plan (on the right). 
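The comparison in part (e) can also be made numerically. The sketch below is not part of the scoring guidelines; it assumes, as the question states, that the fitted line passes through the two sample means, so the slope estimator is b = (mean at the wider strips − mean at the narrower strips) / (difference in widths), and that the two sample means are independent. The function names are hypothetical.

```python
import math
import random
import statistics

SIGMA, N = 5.0, 4                 # from the problem statement
INTERCEPT, SLOPE = 33.8, 3.6      # assumed true regression line

def slope_sd(x_low, x_high):
    """Theoretical SD of b when the line joins two independent sample means."""
    sd_mean = SIGMA / math.sqrt(N)
    return math.sqrt(2) * sd_mean / (x_high - x_low)

def simulate_slopes(x_low, x_high, reps=5000, seed=1):
    """Simulate b for a plan with N strips at each of two widths."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        low = [rng.gauss(INTERCEPT + SLOPE * x_low, SIGMA) for _ in range(N)]
        high = [rng.gauss(INTERCEPT + SLOPE * x_high, SIGMA) for _ in range(N)]
        slopes.append((sum(high) / N - sum(low) / N) / (x_high - x_low))
    return slopes

for label, (x_low, x_high) in {"plan 1": (6, 13), "plan 2": (8, 10)}.items():
    simulated_sd = statistics.stdev(simulate_slopes(x_low, x_high))
    print(label, round(slope_sd(x_low, x_high), 3), round(simulated_sd, 3))
```

Under these assumptions the standard deviation of b is about 0.505 for the first plan and about 1.768 for the second, which agrees with the graphical argument that the first plan gives the less variable slope estimator.
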








Part (f): 


To assess the linear relationship between width of the buffer strip and the amount of nitrogen removed 
from runoff water, more widths should be used. To detect a nonlinear relationship it would be best to 
use buffer widths that were spaced out over the entire range of interest. For example, if the range of 
interest is 6 to 13 feet, eight buffers with widths 6, 7, 8, 9, 10, 11, 12, and 13 feet could be used.
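One way to see why more widths are needed: a plan with only two distinct widths always allows a straight line through the two group means, so it cannot reveal curvature. The sketch below encodes that idea as a simple check; the function names are hypothetical, and the threshold of three distinct widths mirrors the minimum used in the scoring for section 4.

```python
def distinct_widths(widths):
    """Return the sorted distinct buffer widths used in a study plan."""
    return sorted(set(widths))

def can_reveal_curvature(widths):
    # With only two distinct x-values, a straight line fits the two group
    # means exactly, so at least three distinct widths are needed.
    return len(distinct_widths(widths)) >= 3

plan_1 = [6] * 4 + [13] * 4        # first study plan from the question
spread_plan = list(range(6, 14))   # widths 6, 7, ..., 13 feet, one per location

print(can_reveal_curvature(plan_1), can_reveal_curvature(spread_plan))
```
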


Scoring 


This question is scored in four sections. Section 1 consists of parts (a) and (b); section 2 consists of parts (c) 
and (d); section 3 consists of part (e); section 4 consists of part (f). Each of the four sections is scored as 
essentially correct (E), partially correct (P), or incorrect (I). 


Section 1 is scored as follows: 


Essentially correct (E) if the response includes the following two components: 
1. The response in part (a) is correct, as evidenced by the correct interpretation of the slope, in 
context. 
2. The response in part (b) is correct, as evidenced by the identification of extrapolation as the 
reason that the model should not be used AND the response is in context. 





Partially correct (P) if only one of the two components listed above is correct. 


Incorrect (I) if the response fails to meet the criteria for E or P. 


Notes 


Part (a) is incorrect if the interpretation is not in context or if the interpretation does not 
acknowledge uncertainty (for example, does not include “on average” or “about” or “approximately” 
or “predicted” when referring to the increase in nitrogen removed). 

Ideally a correct solution would also include units and make it clear that it is the approximate 
increase for each additional foot added to the buffer, but in the context of this larger investigative 
task, failure to do so is not sufficient to make part (a) incorrect. 

Part (b) is incorrect if extrapolation is not identified as the reason or if the response is not in 
context. 


Section 2 is scored as follows: 


Essentially correct (E) if the response includes the following two components: 


1. The response in part (c) states that the sampling distribution is normal AND provides a correct 

mean and standard deviation. 

2. The response in part (d) uses the correct mean and standard deviation of the sampling 
distribution — or incorrect values carried over from part (c) — AND a correct critical value 
(1.96 or 2) to compute a correct interval. 





Partially correct (P) if only one of the two components listed above is correct.


Incorrect (I) if the response fails to meet the criteria for E or P. 


Notes 


Stating that the sampling distribution is approximately normal is acceptable for part (c). 

Part (c) is incorrect if the response does not state that the distribution is normal or if an incorrect 
mean or standard deviation is given. 

Part (d) is incorrect if an incorrect critical value (for example a t critical value) is used or if an 
incorrect mean or standard deviation — other than incorrect values from part (c) — is used in the 
computation of the interval. 


Section 3 is scored as follows: 


Essentially correct (E) if study plan 1 is chosen in part (e), and the response demonstrates awareness of 
sampling variation in the estimates of the slopes of the regression lines, AND this is clearly 
communicated in the context of the two study plans. 


Partially correct (P) if study plan 1 is chosen in part (e), and the response demonstrates awareness of 
sampling variation in the estimates of the slopes of the regression lines, BUT the justification of the 
choice of study plan 1 is not clearly communicated. 


Incorrect (I) if the response fails to meet the criteria for E or P. 
Section 4 is scored as follows: 
Essentially correct (E) if the response specifies another study plan that uses eight buffer strips of at 
least three different specified widths, and the response indicates in how many locations each width 
will be used, OR the response makes it clear that at least three different buffer widths will be used and 
indicates that the buffer widths to be used will be spread out over the range of interest. 


Note: Specifying eight different widths is sufficient for an E in section 4. 


Partially correct (P) if the response does not meet criteria for E, BUT the stated plan uses at least three 
different widths. Widths need not be specified for a P. 


Incorrect (I) if the plan does not use at least three different widths. 


Each essentially correct (E) section counts as 1 point. Each partially correct (P) section counts as ½ point.


4 Complete Response 

3 Substantial Response 
2 Developing Response 
1 Minimal Response 


If a response is between two scores (for example, 2½ points), use a holistic approach to decide whether to
score up or down, depending on the overall strength of the response and communication. 
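The point tally described above can be sketched in Python as follows. The function names are hypothetical, and the holistic decision for half-point totals is deliberately left to the reader; the code only flags when that decision is needed.

```python
POINTS = {"E": 1.0, "P": 0.5, "I": 0.0}

def section_points(section_scores):
    """Sum the points for the four sections, given a string such as 'EPEI'."""
    return sum(POINTS[s] for s in section_scores)

def needs_holistic_decision(total):
    """True when the total falls between two whole-number scores."""
    return total != int(total)

# Section scores reported in the commentary for samples 6A, 6B, and 6C.
for sample, scores in {"6A": "EEEE", "6B": "EIEE", "6C": "PPEI"}.items():
    total = section_points(scores)
    print(sample, total, needs_holistic_decision(total))
```
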
STATISTICS
SECTION II
Part B 
Question 6 
Spend about 25 minutes on this part of the exam. 
Percent of Section II score—25 


Directions: Show all your work. Indicate clearly the methods you use, because you will be scored on the 
correctness of your methods as well as on the accuracy and completeness of your results and explanations. 


6. Grass buffer strips are grassy areas that are planted between bodies of water and agricultural fields. These strips 
are designed to filter out sediment, organic material, nutrients, and chemicals carried in runoff water. The figure 
below shows a cross-sectional view of a grass buffer strip that has been planted along the side of a stream. 
[Figure: cross-sectional view of a grass buffer strip planted along the side of a stream.]





A study in Nebraska investigated the use of buffer strips of several widths between 5 feet and 15 feet. The study 
results indicated a linear relationship between the width of the grass strip (x), in feet, and the amount of nitrogen 
removed from the runoff water (y), in parts per hundred. The following model was estimated. 


ŷ = 33.8 + 3.6x

(a) Interpret the slope of the regression line in the context of this question. 
(b) Would you be willing to use this model to predict the amount of nitrogen removed for grass buffer strips
with widths between 0 feet and 30 feet? Explain why or why not. 


[Handwritten student response to part (b); largely illegible in this copy. It ends: "... we should not exstrapolate."]




© 2011 The College Board. 
Visit the College Board on the Web: www.collegeboard.org. 




A scientist in California wants to know if there is a similar relationship in her area. To investigate this, she will 
place a grass buffer strip between a field and a nearby stream at each of eight different locations and measure the 
amount of nitrogen that the grass buffer strip removes, in parts per hundred, from runoff water at each location. 
Each of the eight locations can accommodate a buffer strip between 6 feet and 13 feet in width. The scientist
wants to investigate which combination of widths will provide the best estimate of the slope of the regression 
line. 

Suppose the scientist decides to use buffer strips of width 6 feet at each of four locations and buffer strips of 
width 13 feet at each of the other four locations. Assume the model, ŷ = 33.8 + 3.6x, estimated from the
Nebraska study is the true regression line in California and the observations at the different locations are 
normally distributed with standard deviation of 5 parts per hundred. 


(c) Describe the sampling distribution of the sample mean of the observations on the amount of nitrogen 
removed by the four buffer strips with widths of 6 feet. 
(d) Using your result from part (c), show how to construct an interval that has probability 0.95 of containing the
sample mean of the observations from four buffer strips with widths of 6 feet. 
For the study plan being implemented by the scientist in California, the graph on the left below displays intervals
that each have probability 0.95 of containing the sample mean of the four observations for buffer strips of width 
6 feet and for buffer strips of width 13 feet. A second possible study plan would use buffer strips of width 8 feet 
at four of the eight locations and buffer strips of width 10 feet at the other four locations. Intervals that each have 
probability 0.95 of containing the mean of the four observations for buffer strips of width 8 feet and for buffer 
strips of width 10 feet, respectively, are shown in the graph on the right below. 
If data are collected for the first study plan, a sample mean will be computed for the four observations from
buffer strips of width 6 feet and a second sample mean will be computed for the four observations from buffer 
strips of width 13 feet. The estimated regression line for those eight observations will pass through the two 
sample means. If data are collected for the second study plan, a similar method will be used. 


(e) Use the plots above to determine which study plan, the first or the second, would provide a better estimator 
of the slope of the regression line. Explain your reasoning. 
(f) The previous parts of this question used the assumption of a straight-line relationship between the width of
the buffer strip and the amount of nitrogen that is removed, in parts per hundred. Although this assumption 
was motivated by prior experience, it may not be correct. Describe another way of choosing the widths of 
the buffer strips at eight locations that would enable the researchers to check the assumption of a straight- 
line relationship. 
Question 6


Sample: 6A 
Score: 4 


The response in part (a) includes the phrase “expected value of y,” representing an acknowledgment of the 
uncertainty in the interpretation of the regression model. The answer to part (a) is correct. In part (b) the 
response states that using the model to predict outside the range of 5 feet to 15 feet can result in “unreliable 
data.” This captures the essence of extrapolation, so part (b) is answered correctly. Thus section 1, consisting 
of parts (a) and (b), was scored as essentially correct. The response in part (c) is correct because it includes 
the normality of the sampling distribution along with the correct mean, 55.4, and the correct standard
deviation, 5/√4 = 2.5. Part (d) is answered correctly because the correct interval is shown. The interval is not a true
confidence interval, and the phrase “95% C.I” was considered extraneous. Therefore section 2, consisting of
parts (c) and (d), was scored as essentially correct. Through indications on the plots and a written solution, 
the student communicates understanding of variability in the estimated slopes of the regression lines and 
describes how the variability in the estimated slopes would be less in the first study plan (on the left) than in 
the second study plan (on the right). Section 3, consisting of part (e), was thereby scored as essentially 
correct. In part (f) the response specifies that eight different widths will be used. This implies that one width 
will be assigned to each of the eight locations, so section 4, consisting of part (f), was scored as essentially 
correct. Because the four sections of the solution were all scored as essentially correct, the response earned a 
score of 4. 


Sample: 6B 
Score: 3 


The response in part (a) clearly indicates that the increase in the amount of nitrogen removed is on average 
3.6 parts per hundred for “every additional feet [sic] of grass strip.” This is a correct response. The response in 
part (b) warns that “we should not exstrapolate [sic]” and implies that the model may be different over the 
range 0 feet to 30 feet than it is over 5 feet to 15 feet. This is a correct response. Section 1, consisting of
parts (a) and (b), was therefore scored as essentially correct. In part (c) the student provides the correct mean,
55.4, and correctly identifies the distribution as normal, but provides an incorrect standard deviation of 5.
Part (c) was scored as incorrect. No final interval is provided in the solution to part (d), which was also scored 
as incorrect. So section 2, consisting of parts (c) and (d), was scored as incorrect. The response includes 
potential regression lines on the plots in part (e), with labeling that clearly indicates the plot on the left 
(corresponding to the first study plan) represents “small variety” in the slope, whereas the plot on the right 
(the second study plan) exhibits “big variety.” The written part of the solution includes the phrase “the variety 
of lines on the first graph is much less,” communicating an understanding of the variability of estimated 
slopes in the two study plans. Section 3, consisting of part (e), was thus scored as essentially correct. The 
response in part (f) states that “[w]e should take 8 different widths” at values 6 through 13 feet. The “8 points”
in the response are the eight available locations. The solution specifies at least three distinct widths for buffer 
strips and indicates how many locations would be assigned to each width, so section 4, consisting of part (f), 
was scored as essentially correct. Because three sections were scored as essentially correct and one section 
was scored as incorrect, the response earned a score of 3. 


Sample: 6C 
Score: 2 
The interpretation of the slope in part (a) does not convey a sense of the uncertainty in the regression model. 
The response would be correct if a phrase such as on average, expected, or predicted had been associated
with the interpretation of 3.6 parts per hundred of nitrogen removed for each additional increase of one foot in 
the width of the grass strip. Although the word extrapolation does not appear in the student’s response to 
part (b), the sentence “We do not have confidence that this model will apply to widths between 0ft to 30ft”
captures the essence of extrapolation. The response to part (a) is incorrect, and the response to part (b) is 
correct, so section 1, consisting of parts (a) and (b), was scored as partially correct. The description of the 
sampling distribution of the sample mean in part (c) includes the correct mean, 55.4, but an incorrect 
standard deviation of 5. In addition, the response does not explicitly state that the sampling distribution 
follows a normal curve, so the answer to part (c) is incorrect. The correct probability interval appears in
part (d), which makes the response to that part correct. The statement that begins “we are 95% confident ...”
was considered extraneous, as this is an interval for sample means and not the population mean. Section 2, 
consisting of parts (c) and (d), was therefore scored as partially correct. The response in part (e) includes 
explicit mention of “the variability of the regression line’s slope” and, in the accompanying diagrams, 
illustrates that the potential variability in the slope is smaller for study plan 1 (on the left). The correct choice 
of “[t]he first one” (study plan 1), with the accompanying explanation, resulted in section 3, consisting of
part (e), being scored as essentially correct. The solution for part (f) is based on the selection of random digits,
but it does not ensure that at least three distinct widths would be selected. For this reason section 4, 
consisting of part (f), was scored as incorrect. Because one section was scored as essentially correct, two 
sections were scored as partially correct, and one section was scored as incorrect, the response earned a 
score of 2. 